Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ML] Implement new rules design #119

Conversation

dimitris-athanasiou
Copy link
Contributor

@dimitris-athanasiou dimitris-athanasiou commented Jun 5, 2018

The ml-cpp side of elastic/elasticsearch#31110.

Release note: Detectors now support custom rules that enable the user to improve machine learning results by providing some domain-specific knowledge in the form of rule.

Copy link
Contributor

@tveasey tveasey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left some tidy up suggestions, but basically LGTM.

return false;
}

for (auto& member : scopeObject.GetObject()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: looks like it can be const auto&

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tried this but it doesn't compile. I guess I'm using public fields of member and the square brackets operator which probably is not const.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's because RapidJSON doesn't have begin() and end() functions that return const iterators.

Read downwards from "Just one problem ...." in https://mbevin.wordpress.com/2012/11/14/range-based-for/

In RapidJSON there is this:

#if RAPIDJSON_HAS_CXX11_RANGE_FOR
    MemberIterator begin() const { return value_.MemberBegin(); }
    MemberIterator end() const { return value_.MemberEnd(); }
#endif

But sadly no corresponding functions that return ConstMemberIterator.

It would be trivial to add - you could submit a PR to RapidJSON if you want to be famous.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll hold on to get famous as a rock star instead!

: m_Action(E_FilterResults), m_Conditions(), m_ConditionsConnective(E_Or),
m_TargetFieldName(), m_TargetFieldValue() {
m_Conditions.reserve(1);
CDetectionRule::CDetectionRule() : m_Action(E_SkipResult), m_Conditions() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: you can initialise m_Action to E_SkipResult where it is declared in the class body and then keep the auto generated default constructor.

@@ -51,99 +42,48 @@ bool CDetectionRule::apply(ERuleAction action,
return false;
}

if (this->isInScope(model, pid, cid) == false) {
if (m_Scope.check(model, pid, cid) == false) {
return false;
}

for (std::size_t i = 0; i < m_Conditions.size(); ++i) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: switch to range based for loop.

bool CRuleScope::check(const CAnomalyDetectorModel& model, std::size_t pid, std::size_t cid) const {

const CDataGatherer& gatherer = model.dataGatherer();
for (auto& scopeField : m_Scope) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

const auto&


std::string CRuleScope::print() const {
std::string result{""};
if (m_Scope.empty() == false) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this if is unnecessary, begin can safely be called and you don't enter the loop in this case so no harm done.

}

void CRuleScope::exclude(std::string field, const core::CPatternSet& filter) {
m_Scope.push_back({field, TPatternSetCRef(filter), E_Exclude});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: emplace_back

}

void CRuleScope::include(std::string field, const core::CPatternSet& filter) {
m_Scope.push_back({field, TPatternSetCRef(filter), E_Include});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: emplace_back

namespace ml {
namespace model {

CRuleScope::CRuleScope() : m_Scope() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: use CRuleScope() = default in header.

}
std::string result = this->print(m_AppliesTo);
result += " " + this->print(m_Condition.s_Op) + " " +
core::CStringUtils::typeToString(m_Condition.s_Threshold);
return result;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: naming the result doesn't gain anything in readability IMO, so I'd return a temporary here to guaranty RVO, i.e.

return this->print(m_AppliesTo) + " " + this->print(m_Condition.s_Op) + " " + 
       core::CStringUtils::typeToString(m_Condition.s_Threshold);

@dimitris-athanasiou
Copy link
Contributor Author

@droberts195 @tveasey Could you have another look at this please?

@dimitris-athanasiou
Copy link
Contributor Author

Noting here that I had a go at trying to reuse the state restore traverser paradigm to parse the rules. However, this didn't work out as CJsonStateRestoreTraverser is not handling arrays. It's too risky to fiddle with that just for the sake of parsing the rules so I'm leaving it as is for the time being.

Copy link
Contributor

@droberts195 droberts195 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

I left a couple of nits about unnecessary copying. In addition to these there will still be unnecessary copying because core::CTriple doesn't have constructors that take rvalue references. Let's not complicate this PR by changing that here too, but it's worth noting that until all our classes are capable of efficiently consuming moved objects it might be better to stick to passing strings by const reference instead of value.

namespace model {

void CRuleScope::include(std::string field, const core::CPatternSet& filter) {
m_Scope.emplace_back(field, TPatternSetCRef(filter), E_Include);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since you're passing field by value you should move it into m_Scope, i.e.:

    m_Scope.emplace_back(std::move(field), TPatternSetCRef(filter), E_Include);

Same in exclude immediately below.

void CDetectionRule::addCondition(const CRuleCondition& condition) {
m_Conditions.push_back(condition);
void CDetectionRule::includeScope(std::string field, const core::CPatternSet& filter) {
m_Scope.include(field, filter);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use std::move(field), as it was passed by value and is not used anywhere else.

Same in excludeScope immediately below.

@tveasey
Copy link
Contributor

tveasey commented Jun 12, 2018

Regarding these cases: I agree, I don't think we should abandon passing by const reference, for consistency if nothing else. However, this is basically LGTM2.

@dimitris-athanasiou dimitris-athanasiou merged commit 8ce8abf into elastic:master Jun 12, 2018
@dimitris-athanasiou dimitris-athanasiou deleted the implement-new-rules-design branch June 12, 2018 13:26
dimitris-athanasiou added a commit that referenced this pull request Jun 12, 2018
The ml-cpp side of the new rules implementation.
@sophiec20 sophiec20 added :ml and removed :ml labels Jun 12, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants